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Abstract 

Background: Aciliini presently includes 69 species of medium-sized water beetles distributed on all continents 
except Antarctica. The pattern of distribution with several genera confined to different continents of the Southern 
Hemisphere raises the yet untested hypothesis of a Gondwana vicariance origin. The monophyly of Aciliini has been 
questioned with regard to Eretini, and there are competing hypotheses about the intergeneric relationship in the 
tribe. This study is the first comprehensive phylogenetic analysis focused on the tribe Aciliini and it is based on 
eight gene fragments. The aims of the present study are: 1) to test the monophyly of Aciliini and clarify the position 
of the tribe Eretini and to resolve the relationship among genera within Aciliini, 2) to calibrate the divergence times 
within Aciliini and test different biogeographical scenarios, and 3) to evaluate the utility of the gene CAD for 
phylogenetic analysis in Dytiscidae. 

Results: Our analyses confirm monophyly of Aciliini with Eretini as its sister group. Each of six genera which have 
multiple species are also supported as monophyletic. The origin of the tribe is firmly based in the Southern 
Hemisphere with the arrangement of Neotropical and Afrotropical taxa as the most basal clades suggesting a 
Gondwana vicariance origin. However, the uncertainty as to whether a fossil can be used as a stem-or crowngroup 
calibration point for Acilius influenced the result: as crowngroup calibration, the 95% HPD interval for the basal 
nodes included the geological age estimate for the Gondwana break-up, but as a stem group calibration the basal 
nodes were too young. Our study suggests CAD to be the most informative marker between 15 and 50 Ma. 
Notably, the 2000 bp CAD fragment analyzed alone fully resolved the tree with high support. 

Conclusions: 1) Molecular data confirmed Aciliini as a monophyletic group. 2) Bayesian optimizations of the 
biogeographical history are consistent with an influence of Gondwana break-up history, but were dependent on 
the calibration method. 3) The evaluation using a method of phylogenetic signal per base pair indicated Wnt and 
CAD as the most informative of our sampled genes. 
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Background 

The tribe Aciliini Thomson, 1867 belongs to the sub- 
family Dytiscinae, comprising most of the larger diving 
beetles within Dytiscidae. Aciliini presently include 69 
described species in seven genera [1]. Adults have gene- 
rally good flight capability and can be found in both 
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temporary and permanent water bodies [2-4]. They are 
distributed worldwide and mostly inhabit standing wa- 
ters like pools, ponds, lakes, swamps and bogs [1,3,4]. 
Both larvae and adults are predatory on smaller aquatic 
invertebrates and the larvae have a characteristic arched 
body shape with a small narrow head. 

The distribution of the tribe covers all main zoo- 
geographical regions except Antarctica, but distribution 
of the genera suggests an ancient origin (Figure 1). 
Thermonectus Dejean, 1833 is the most speciose genus of 
the tribe (with eighteen species) and occurs in the Neotropics 
and the Nearctic north to the Canadian border [2]. The 
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Figure 1 Approximate geographical distribution of Aciliini genera. 
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genera Acilius Leach, 1817 and Graphoderus Dejean, 1833, 
which each include more than ten species, are restricted to 
the Holarctic. Aethionectes Sharp, 1882 is a small Afro- 
tropical genus with eight species, one of which is restricted 
to Madagascar. Sandracottus Sharp, 1882 occurs in the 
Australasian region. Rhantaticus Sharp, 1882 is a widespread 
genus with currently a single recognized species R. congestus 
(Klug, 1833) occurring throughout the Afrotropical, south 
Palearctic, Oriental and Australian regions. However, pre- 
liminary studies suggest that it is a species complex in 
need of revision. Finally, the most recently added genus 
Tikoloshanes Omer-Cooper, 1956 includes the single species 
T. eretiformis Omer-Cooper, 1956. It is rarely collected, never 
in great numbers, and only known from southeastern Africa. 
Tikoloshanes share similarities with the tribe Eretini Crotch, 
1873 as the color and shape are very similar to those of 
Eretes Laporte, 1833, but the genus was proposed as a new 
genus within Aciliini, close to Aethionectes [5]. Tikoloshanes 
has never been included in a phylogenetic analysis using mo- 
lecular data. The tribe Aciliini has not previously undergone 
any major revision or phylogenetic analysis, but the genus 
Acilius was recently revised [3] and its evolutionary his- 
tory inferred with molecular and morphological data [6]. 



Erichson [7] referred species today in Graphoderus, 
Sandracottus and Thermonectus to Hydaticus until 1833 
when Dejean [8] recognized Thermonectus and Grapho- 
derus as separate genera. Aube [9] transferred Thermo- 
nectus to a subgroup of Acilius and Graphoderus to a 
subgroup of Hydaticus. Regimbart [10] was the first au- 
thor who suggested a delimitation of Acilius, Grapho- 
derus and Hydaticus similar to the one used today based 
on characters of the metasternal wing and metatibial 
spurs, but Thermonectus and Sandracottus were in- 
cluded in Graphoderus. A few years later, Sharp [11] 
formalized this view, erected the tribes Thermonectini 
(= Aciliini) and Hydaticini, and recognized the six genera 
of Aciliini which are generally accepted today (except 
the more recently described Tikoloshanes). Hence, since 
the early classification there has been two alternative 
hypotheses. First, the genus Acilius has been regarded as 
the most divergent of the genera based on morphology, 
especially that of the females and the wide body shape. 
This is reflected in, for example, Erichsons [7] classifica- 
tion, with Graphoderus, Sandracottus and Thermonectus 
grouped with each other near Hydaticus. An alternative 
hypothesis is the view of Aube [9] that there are 
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similarities between Acilius and Thermonectus to the 
exclusion of the other genera. These hypotheses are 
mutually exclusive if similarity is to be interpreted as 
evidence of sister group relationships, although they 
were erected prior to the development of a cladistic 
theory and could reflect similarity based on shared pie- 
siomorphic character states. 

The evolutionary relationships of Aciliini genera have 
never been analyzed in detail with modern phylogenetic 
methods. Only a handful of studies with different foci have 
included a few representatives in phylogenetic analyses. 
Millers [12] study on Dytiscinae was based on adult 
morphology and recovered Graphoderus as a group sister 
to the rest of Aciliini, and Acilius and Thermonectus as 
sister clades in agreement with Aube [9]. In a later mor- 
phological study, with a larger taxon sampling for all of 
Dytiscidae and with a focus on the female reproductive 
tract, Miller [13] included all known genera of Aciliini. 
However the genera Aethionectes, Rhantaticus, Sandracot- 
tus and Tikoloshanes were merged into a single terminal 
as they scored identically for all morphological characters 
used. Again, Acilius and Thermonectus came out as sister 
taxa [13]. 

In a molecular study Ribera et al. [14] included single 
representatives of five of the seven Aciliini genera and 
arrived at a somewhat different intergeneric topology. In 
their analysis Acilius instead came out as sister to Gra- 
phoderus, and Thermonectus was placed as sister to this 
clade. Also Rhantaticus and Sandracottus were identified 
as sister groups and, together, sister to the rest of 
Aciliini. Moreover they recovered Eretini nested inside 
Aciliini suggesting the tribe may be paraphyletic. Only 
the first conclusion was strongly supported with poster- 
ior probability of > 0.95. However their main focus was 
on the phylogeny of supra-generic groups of Dytiscidae 
based on two mitochondrial and two nuclear genes, in- 
cluding five species of Aciliini and two species of Eretini. 
Eretini, which includes only the genus Eretes, has been 
proposed to be the sister tribe to Aciliini, since it shares 
similar setae on the metafemur which are apically bifid 
and arranged in a strongly oblique row [12,15]. The 
larvae also share a unique shape, an erratic shrimp-like 
swimming-escape behavior and two pairs of large 
camera-like stemmata which may be an additional apo- 
morphy for Aciliini and Eretini [16]. The prospect of a 
paraphyletic Aciliini with Eretini nested inside was quite 
surprising, but tantalizingly suggests the possibility that 
the enigmatic Tikoloshanes eretiformis is, indeed, closely 
related to Eretini. 

A dated phylogenetic hypothesis with robust support 
is needed in order to test whether the intergeneric rela- 
tionships in the group, and the age of the group, sup- 
ports a Gondwana break-up origin. Both the hypotheses 
outlined above contradict such an origin. Acilius basal 



would infer a Holarctic origin of the group and an Aci- 
lius + Thermonectus sister group relationship would 
imply a much more recent Holarctic-Neotropical con- 
nection and diversification. Neither is consistent with a 
Gondwana break-up origin, which would require basal 
positions of southern hemisphere clades. Fossils in the 
tribe are very sparse and of no direct help to place the 
origin of the group in the Cretaceous as all are of youn- 
ger Neogene or late Paleogene ages. However, the use of 
relaxed clock models and Bayesian frameworks to in- 
clude the calibration uncertainty as prior distributions 
can make use of such fossil to derive age span estimates 
for the basal nodes in a well-supported tree. 

The search for new nuclear coding genes for phylogeny 
reconstruction in different target groups is ongoing [17]. 
Coleoptera does not contain any high priority model 
organisms except perhaps the flour beetle, Tribolium cas- 
taneum, but its complete genome was not entirely se- 
quenced until 2008 [18], which is one reason the selection 
of easily amplifiable single-copy nuclear genes in beetles 
has lagged behind many other groups of organisms. The 
nuclear protein-coding gene CAD, also known as rudi- 
mentary, was for the first time introduced to the insect 
systematics community by Moulton and Wiegmann 
[17]. Since then it was successfully used in e.g. Diptera 
[17], Hymenoptera [19-21], Neuropterida [22], Trichop- 
tera [23] and Lepidoptera studies [24]. CAD performed 
well both separately and in combination with other 
genes as evaluated by Johanson and Malm [23]. Some 
studies to date have used CAD in Coleoptera [25-34], 
mainly in Carabidae, Scolytinae and Silphidae but none 
have yet used CAD in Adephagan water beetle (Hydra- 
dephaga) phylogenetics. 

The CAD gene encodes three enzymes that catalyze 
pyrimidine biosynthesis: carbamoyl phospate synthe- 
tase (CPS), aspartate transcarbamylase (ATC) and 
dihydroorotase (DHO) and with a combined total 
length of about 6.6 kb. Recent studies on Braconidae 
by Sharanowski et al. [35] demonstrated that the CPS 
region contains a high number of parsimony infor- 
mative sites. Wild and Maddison [27] suggested CAD 
to be the highest performing gene fragment for both shal- 
low and deep phylogenetics among eight evaluated nu- 
clear genes in beetles and designed several primer pairs 
for a 2020 bp CPS region, free of introns in most taxa of 
beetles. 

This study is the first comprehensive phylogenetic 
analysis focused on the tribe Aciliini and it is based on 
eight gene fragments. The aims of the present study are: 
1) to test the monophyly of Aciliini, clarify the position 
of the tribe Eretini and to resolve the relationship among 
genera within Aciliini 2) to calibrate the divergence times 
within Aciliini, reconstruct ancestral distributions and 
thereby test the hypothesis of a Gondwana vicariance 
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influence, and 3) to evaluate the utility of CAD for phylo- 
genetic analysis in Dytiscidae. 

Methods 

Taxon sampling 

Taxon sampling was focused on assembling the highest 
species representation as possible for the tribes Aciliini 
and EretinL Our ingroup comprised species from all 
seven genera of Aciliini (61% of the known Aciliini spe- 
cies): Acilius (12 species, 92%), Aethionectes (2 species, 
25%), Graphoderus (10 species, 91%), Rhantaticus (1 spe- 
cies, 100%), Sandracottus (5 species, 31%), Thermonectus 
(11 species, 58%), Tikoloshanes (1 species, 100%), and all 
four species of Eretini represented by the single genus 
Eretes. Our taxa cover all six zoogeographic regions: Ne- 
arctic {Acilius, Graphoderus, Thermonectus), Neotropical 
{Thermonectus), Palearctic {Acilius, Graphoderus, Sandra- 
cottus, Rhantaticus), Afrotropical {Aethionectes, Rhantati- 
cus, Tikoloshanes), Oriental {Rhantaticus, Sandracottus) 
and Australian {Sandracottus, Rhantaticus) (see Additional 
file 1: Table SI). 

Since Rhantaticus congestus is suspected to actually 
represent a species complex we included several speci- 
mens of Rhantaticus from different regions. As out- 
groups to root the tree we used representatives from 
Hydaticini, which are thought to be the closest relatives 
[12], and Dytiscini. 

DNA extraction, amplification and sequencing methods 

For most specimens whole genomic DNA was extracted 
using Qiagen DNEasy kit (Valencia, California, USA) 
and the animal tissue protocol. Thoracic muscle tissue 
was removed from large specimens for extraction. A few 
small specimens were extracted by removing the abdo- 
men and placing the remaining portion of the specimen 
in extraction buffer for incubation. Dissected specimens 
and whole extracted specimens were retained for vou- 
chering in the Museum of Southwestern Biology Div- 
ision of Arthropods (MSB A, K.B. Miller, curator). Some 
DNA was extracted from legs of ethanol-preserved or 
dry-mounted material using the DNEasy DNA extrac- 
tion kit (Qiagen, Valencia, CA) following the manufac- 
turers recommendation, except for that 20 ul of 1 molar 
DTT (dithiothreitol) was added during the lysis stage. 
The GeneMole robot and MoleStrips™ DNA Tissue (Mole 
Genetics) was used for isolation of DNA from some of the 
samples with an elution volume of 100 ul. After extrac- 
tion, legs were returned to the vouchers for archiving. 

Eight gene fragments were selected to infer the phy- 
logeny and divergence times. These included three nuclear 
protein-coding genes, histone 3 (H3), wingless (Wnt) and 
rudimentary (CAD, a part of CPS), two ribosomal genes, 
mitochondrial 16S and nuclear 28S, and the mitochondrial 
protein-coding genes cytochrome c oxidase subunit 



1 (COI) and II (COII). COI was sequenced in two non- 
overlapping fragments. Primers used for amplification and 
sequencing were derived from several sources, but also 
new primers were designed for some taxa (see Additional 
file 2: Table S2). 

Most of the DNA fragments were amplified using 
"Ready-to-go™" PCR Beads from Pharmacia Biotech fol- 
lowing the manufacture s standard protocols. Each 25 ul 
reaction contained 1 ul of 10 uM primer pair mix (x2), 

2 ul of DNA template and 21 ul of water. The thermal 
cycling profile for "Ready-to-go" PCR was 94°C for 
5 min, followed by 40 cycles of 94°C 30 s, 56-50°C for 
30 sec, 72°C for 1 min and finally 72°C for 8 min. Prod- 
uct yield, specificity of amplification and contamination 
were investigated using agarose gel electrophoresis. PCR 
products were purified using ExoFast Cleanup mix (Fer- 
mentas) and cycle sequenced using the same primers as 
in the PCR. For sequencing reactions the ABI BigDye 
Terminator kit ver. 3.1 kit (Applied Biosystems) was 
used. Each sequencing reaction mixture included 1 ul of 
BigDye™ (Applied Biosystems), 1 ul of 1,6 uM primer and 
2-4 ul of PCR product. The sequence cycling profile was 
95°C for 1 min and then 25 cycles of 95°C 30 sec, 50°C 
15 sec and 60°C 4 min. Sequencing products were purified 
using the DyeEx 96 kit and fragments were analyzed on 
an ABI377xl analyzer from Applied Biosystems at the 
Molecular Systematics Laboratory, Swedish Museum of 
Natural History. 

Some fragments were amplified using PCR with TaKaRa 
Amplitaq (Applied Biosystems, Foster City, CA, USA) on 
an Eppendorf Mastercycler ep gradient S thermal cycler 
(Eppendorf, Hamburg, Germany). Amplification condi- 
tions were similar to those used by Miller et al. [6,36,37]. 
Products were purified using ExoSAP-IT (USB-Affyme- 
trix, Cleveland, OH, USA) and cycle sequenced using ABI 
Prism Big Dye (v3.1; Invitrogen, Fairfax, VA, USA) using 
the same primers as for amplification. Sequencing reaction 
products were purified using Sephadex G-50 Medium (GE 
Healthcare, Uppsala, Sweden) and sequenced using an 
ABB 130x1 Genetic Analyzer (Applied Biosystems, Foster 
City, CA, USA) in the Molecular Biology Facility at the 
University of New Mexico. 

Gene regions were sequenced in both directions. Result- 
ing sequence data were examined and edited using the 
program Sequencher 4.10 (Gene Codes Corporation). The 
forward and reverse primers were trimmed from the begin- 
ning and end of each sequence. Sequences are submitted 
to NCBI GenBank under accession numbers KF978819- 
KF979120. 

For some taxa we did not retrieve full length frag- 
ments, probably due to DNA degradation especially for 
the dry-pinned specimens. Sequences from all eight gene 
fragments were obtained from specimens when possible 
and missing data represented 11% at the gene fragment 
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level DNA sequences were aligned with MUSCLE [38] 
in MEGA 5 [39] using the default settings. The protein- 
coding genes contained no gaps except for CAD which 
had a 3, 6 or 9 bp long indel at one position. The final 
alignment of the eight gene segments included 6095 bp 
and all gaps were treated as missing data in the ana- 
lyses. Data tables and matrices were created with Voseq 
1.4.4 [40]. 

Partition finder 

To determine the best-fit partitioning scheme of mo- 
lecular evolution for our dataset we used PartitionFinder 
Vl-1.0.1 [41]. During analyses branch lengths were un- 
linked to allow the program to estimate them independ- 
ently for each subset. The best model, among the ones 
available in MrBayes 3.2 [42], was searched for under 
the greedy search algorithm based on the Bayesian Infor- 
mation Criterion (BIC) model metric. Based on BIC scores 
for each partition, GTR + I + G or HKY + G model were 
the best-scoring models, followed closely by the SYM + I + 
G and HKY + I + G models. We used the partitioning 
scheme and among-site rate variation suggested by Parti- 
tionFinder but instead of selecting one substitution model 
a priori, we used reversible- jump MCMC to allow sam- 
pling across the entire substitution rate model space [43] 
(see Additional file 3: Table S3). Apart from using the 
partitioning scheme suggested by Partition Finder, we 
also tested partitioning our data following a commonly 
suggest scheme with six partitions [37]: 1) 1 st and 2 nd 
codon position of mitochondrial protein-coding genes, 
2) 3 rd codon position of mitochondrial protein-coding 
genes, 3) 1 st and 2 nd codon position of nuclear protein- 
coding genes, 4) 3 rd codon position of nuclear protein- 
coding genes, 5) mitochondrial ribosomal genes (16S), 
6) nuclear ribosomal genes (28S). 

Phylogenetic analyses 

We used Bayesian inference to estimate the phylogenetic 
relationship among taxa. Four taxa were used as out- 
groups: Hydaticus aruspex Clark, 1864, Hydaticus trans- 
versalis Pontoppidan, 1763, Hydaticus leander Rossi, 1790 
and Dytiscus verticalis Say, 1823. Hydaticini is considered 
to be the closest relative to Aciliini + Eretini among the 
outgroups included in this study [12,14]. Data were first 
examined by analyzing the different gene regions sepa- 
rately. The final phylogenetic analysis was conducted on 
the combined data matrix of all eight gene fragments. 

The partitioned data set was analyzed by Bayesian 
methods using Metropolis-coupled Markov Chain Monte 
Carlo (MCMC) in MrBayes 3.2 [42]. All partitions were 
unlinked allowing each partition to have its own set of 
parameters. The analysis was run with four chains for 
10 000 000 generations with a sample frequency of 1000 



and a burn-in value of 25%. The parameter estimations 
were checked in Tracer v 1.5. [44]. 

All analyses were repeated twice to ensure that the 
final trees converged to the same topology and conver- 
gence of runs was monitored with the statistics provided 
by MrBayes 3.2. The final tree was visualized in FigTree 
1.3 and annotated with Adobe Illustrator CS5. 

To ensure that the MCMC analyses had run long 
enough such that tree topologies were sampled in 
proportion to their true posterior probability distribu- 
tion, we used AWTY [45,46] for graphical visualization 
of MCMC convergence. 

Dating analysis 

A calibrated tree was obtained using the topology re- 
trieved from MrBayes in a Bayesian MCMC analysis with 
BEAST 1.7.5 [44]. We used the same partitions as in the 
MrBayes analyses. Since Beast does not allow the revers- 
ible jump MCMC the General Time Reversible substitu- 
tion model with gamma and invariant sites (GTR + T + I) 
was selected. The uncorrelated lognormal clock was used 
to relax the equal rates constraint across the tree that a 
strict clock enforces [47]. 

Our primary calibration point was based on the fossil 
of Acilius florissantensis Wickham, 1909. The fossil is 
from the Florissant formation in Colorado, USA, which 
occurs on the boundary between Oligocene and Eocene, 
and has been dated to 34.07 +/- 0.10 Ma [48]. Based on 
the description of the fossil it is possible to either associ- 
ate the fossil with the monophyletic Nearctic clade of 
Acilius [6] or to the whole genus. The narrower body- 
shape is in agreement with the Nearctic clade (as is the 
location of the formation), and Wickham [49] mainly 
compared the fossil with Nearctic A. semisulcatus Aube, 
1838. But that body-shape is also found in one of the 
Palearctic species (A. duvergeri Gobert, 1874) and it is 
ambiguous if this is the apomorphic or plesiomorphic 
state in the genus. The fossil is a ventral impression of a 
male, whereas a dorsal female impression could have been 
more informative. Because of this uncertainty in place- 
ment we tested the effect of using it both as a stem-node 
constraint of Acilius and as a crown-node constraint of 
Acilius, the latter appropriate if the fossil can be assigned 
to the Nearctic clade. The prior on the age of the node 
was set to a lognormal distribution with an offset at 
34,0 Ma (mean = 7, Std = 5,0), which gave a distribution 
with median at 40 Ma and 95% of the prior distribution 
between 36 and 50 Ma. As a fossil only represents a mini- 
mum age of a node and the real age is likely to be some 
amount older followed by a declining tail of probability 
further back, this lognormal prior captures more or less 
our prior beliefs [50,51]. 

We also performed the same analyses using an exponen- 
tial prior (offset = 34 Ma, mean = 5, 95% of the distribution 
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between 34 and 52 Ma) instead of a lognormal prior (see 
[50] for a discussion of the use of exponential versus 
lognormal distribution to model fossil calibration uncer- 
tainty). The results were very similar compared to the dif- 
ference of using the fossil on either the stem or crown 
node, why only the lognormal and the node of calibration 
will be discussed further. 

Proposals for updating the topology in the MCMC 
chain were inactivated so that only branch lengths 
were estimated in this analysis. The analysis was run 
for 30 000 000 generations sampled every 3000 gene- 
rations with a burn-in value of 10%. We used Tree- 
Annotator to calculate the maximum clade credibility 
tree and visualized it in FigTree. 

Biogeographic analyses 

We used Bayesian Binary Method (BBM) implemented 
in RASP (Reconstruct Ancestral State in Phylogenies) [52] 
to reconstruct the ancestral areas of Aciliini. The current 
distribution areas of species (Figure 1) were obtained 
from Nilsson [1] and coded as follows: A (Nearctic), B 
(Neotropical), C (Afrotropical), D (Oriental), E (Australian), 
F (Palearctic). The analysis was conducted on the dated tree 
based on the primary calibration point as a crown group 
constraint of Acilius with the lognormal prior distribution. 
The outgroup taxa were removed from the analyses, since 
the model assigns virtual outgroups to the phylogenetic 
tree prior to the start of the analyses. 

The four MCMC chains were run simultaneously for 
10 000 000 generations. The state was sampled every 
1000 generations. State estimation was run under the 
F81 + G model for the BBM analysis with wide root dis- 
tribution to allow assigned outgroup to occur in all areas 
occupied by the ingroup. This is the most complex model 
implemented in RASP, with properties expected to yield 
realism to reconstruct ancestral distribution. The ma- 
ximum number of areas for this analysis was kept to six. 

Phylogenetic signal and saturation 

We applied the phylogenetic signal method of Townsend 
[53,54], modified to exclude the normalization equation, 
to determine the Phylogenetic Informativeness (PI) for 
each gene fragment and each codon position in our 
dataset. We followed the procedure described by Malm 
et al. [19] and Klopfstein et al. [55] and calculated site 
rates in HYPHY 2.1.2 [56], derived from an ultrametric 
tree with branch lengths proportional to time. PI values 
were estimated in R (www.r-project.org), which allowed 
us to visualize the evolutionary rates of each codon po- 
sition over the evolutionary time of our phylogeny [19,53]. 
We also estimated PI average for each gene codon position 
by dividing it by gene fragment length for comparison 
among the genes. 



Saturation plots for each gene and codon position were 
produced by retrieving pairwise uncorrected p-distances, 
and plotting them against inferred branch-length distances 
on the tree with the highest likelihood in the tree sample 
from the Bayesian MCMC analysis [55]. 

Results 

Gene sequence variation 

The combined and aligned data set included 6095 bp of 
DNA sequences. The longest fragments were obtained 
from CAD, 2008 bp, followed by COI 3 'end (821 bp), 
COII (684 bp), COI 5 'end (671 bp), 28S (670 bp), Wnt 
(466 bp), 16S (447 bp) and H3 (328 bp). Length variation 
in gene sequences obtained during this study was min- 
imal. There were no introns identified in the alignments 
but we found a highly variable section within the first 
150 bp from the 5' end of the CAD fragment. 

The combined dataset included 3926 conservative sites 
and 1862 were parsimony-informative. The proportion 
of variable sites in nuclear and mitochondrial protein- 
coding genes were similar (28-38%), while for the riboso- 
mal genes 16S and 28S it was 25 and 8%, respectively. 
The 3 rd codon position among all protein coding genes 
was the most parsimony informative, about 80% of all 
characters for each gene (Figure 2). Partition Finder di- 
vided the dataset into four partitions: 1) 16S, 2) 28S, 1 st 
and 2 nd codon position of mitochondrial and nuclear 
genes; 3) nuclear genes 3 rd codon position; 4) mitochon- 
drial genes 3 rd codon position (see Additional file 3: 
Table S3). 

Phylogenetic relationships 

The Bayesian analysis of the combined data resulted in a 
well-resolved and overall highly supported phylogeny 
(Figure 3). The ingroup Aciliini + Eretini was strongly 
supported (posterior probability, pp = 1.0) and each tribe 
was resolved as monophyletic with high support (pp = 
1.0). In Eretini, the Australian Eretes australis (Erichson, 
1842) was recovered as sister to the remaining three taxa 
in Eretes (pp = 1.0). In Aciliini, all six genera were reco- 
vered as monophyletic with high support (pp = 1.0). 

The genus Thermonectus was identified as sister group 
to the rest of Aciliini with high support (pp = 1.0), and was 
divided into two main clades. The two Afrotropical ge- 
nera, Aethionectes and Tikoloshanes, were recovered as 
sister clades with moderate support (pp = 0.78) and basal 
to the rest of Aciliini excluding Thermonectus (pp = 0.96). 

The remaining Aciliini was divided into two main 
clades, a Holarctic clade {Acilius + Graphoderus) and an 
Australasian- Afrotropical clade {Rhantaticus + Sandracot- 
tus), both with high support (pp = 1.0). The internal reso- 
lution within the genera was fully resolved, but, in some 
cases, especially within Sandracottus, with weak support. 
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Rhantaticus, probably representing a species complex, was 
divided into an Afrotropical and an Australasian clade. 

The Holarctic clade consisted of Graphoderus (pp = 1.0) 
and Acilius (pp = 1.0) with all internal branches highly 
supported except three. In Graphoderus the Nearctic spe- 
cies G. liberus was sister to the rest of Graphoderus and 
two other Nearctic taxa form the following consecutive 
branches, strongly suggesting a Nearctic origin for the 
genus. The genus Acilius was highly supported (pp = 1,0) 
and divided into two main clades, a Nearctic (pp = 1.0) 
and a Palearctic (pp = 1.0). 

The alternative partitioning scheme with six partitions 
resulted in an identical intergeneric topology, and only 
differed at one position within Thermonectus and at one 
position within Graphoderus, The sister group relation- 
ship of Tikoloshanes and Aethionectes which had only 
moderate support in the four partition analysis was given 
stronger support in the six partition analysis (pp = 0.98). 

To explore the contribution of CAD we ran separate 
analyses with CAD alone and all data except CAD, based 
on the same model of evolution described above. CAD 
alone, which makes up 31% of the entire dataset (35.5% 
of parsimony informative sites), resulted in a topology 
identical to the combined analysis in terms of tribal and 
intergeneric resolutions (Figure 4). The support values 
among genera were even higher for some nodes. How- 
ever, species resolutions in some genera were lower. For 
example, Palearctic species in the genus Acilius and 
resolution within the Rhantaticus complex collapsed 
into polytomies. 

Analysis of all data except CAD resulted in a different 
topology (Figure 5). The intergeneric resolution con- 
flicted with the resolution from CAD alone and from 
the combined dataset. Aciliini was divided into two 
main clades: 1) -Thermonectus was grouped as sister to 



Rhantaticus + Sandracottus, 2) -Aethionectes was identi- 
fied as sister group to the genera Graphoderus, Acilius 
and Tikoloshanes. Tikoloshanes was unexpectedly nested 
inside an otherwise Holarctic clade (pp = 0.81). 

Time of divergence 

The estimation of divergence time using the fossil as a 
crown-node constraint on Acilius suggested congruent 
ages for the basal nodes with relevant Gondwana break- 
up ages (Figure 6). The basal divergence of Aciliini and 
Eretini took place in the mid-Cretaceous around 109 Ma 
(95% highest posterior density = 137-88 Ma). The first 
branching event within Aciliini was estimated at 92 Ma 
(114-74 Ma) and involved the divergence of Neotropical 
Thermonectus from the rest of Aciliini. The second main 
branching event was the split between Afrotropical gen- 
era Aethionectes and Tikoloshanes and the ancestor of 
the remaining Aciliini (excluding Thermonectus) about 
81 Ma (101-65 Ma). Aethionectes and Tikoloshanes di- 
verged at 69 Ma (88-54 Ma), Graphoderus and Acilius at 
63 Ma (78-51 Ma), and Rhantaticus and Sandracottus at 
56 Ma (71-43 Ma). 

Using the fossil calibration as an Acilius stem- node 
constraint gave significantly younger ages not congruent 
with Gondwana vicariance events. For example the split 
between Neotropical Thermonectus and the rest of 
Aciliini was estimated at 58 Ma (70-49 Ma), Graphoderus 
and Acilius at 40 Ma (46-36 Ma), and Rhantaticus and 
Sandracottus at 35 Ma (43-28 Ma). 

Biogeographical analysis 

The biogeographical analysis favoured an Neotropical 
(59%) over an Afrotropical (23%) origin of the tribe 
Aciliini (Figure 7). So optimized, Thermonectus originated 
in South America and later dispersed to the Nearctic region 
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Figure 3 Phylogeny from Bayesian analysis of all eight combined gene fragments. Support values next to the nodes are Bayesian 
posterior probabilities. 



with is in line with the diversity being greatest in the 
Neotropical part of the New World. However, the 
ancestral region for Aciliini excluding Thermonectus 
remained concentrated in the Afrotropical region (80%) 
where Aethionectes and Tikoloshanes are found. The 
deepest divergence within Aciliini is hence between an 



ancestor in South America and an ancestor in Africa, 
consistent with a Gondwana vicariance event. This was 
followed by a series of dispersals to the Palearctic, 
Oriental, Nearctic and Australian regions during the 
Cenozoic. The ancestor of the Holarctic Graphoderus + 
Acilius was optimized to be of Nearctic origin (91%) 
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Figure 4 Phylogeny from Bayesian analysis of the single CAD gene fragment. Support values next to the nodes are Bayesian 
posterior probabilities. 



with later dispersals to the Palearctic region within each 
genus independently. The ancestral distribution of the 
ancestor of Eretini was inferred to most likely be the 
Australian region (50%). 

Phylogenetic informativeness 

The phylogenetic informativeness (PI) curves, which at- 
tempt to visualize phylogenetic information over time 
for a given dataset, showed high peaks at ages younger 
than 20 Ma for the 3 rd codon positions of all mitochon- 
drial protein-coding genes (Figure 8a-c). This indicates a 
partition most informative for the radiation within genera. 



The 3 r codon positions of the nuclear protein-coding 
genes showed the highest PI values between 35-50 Ma 
and a trailing slow decline over deeper time. For the 1 st 
and 2 nd codon positions PI increased slowly with time 
until 30-50 Ma, and was overall lower for both nuclear 
and mitochondrial genes and more extended over time. 

The original Townsends PI profile (Figure 8a) gave 
more narrow peaks than the Modified Townsends PI 
profile (Figure 8b), suggesting higher level of "noise" in 
the data, which can mislead phylogenetic analysis [56]. 
In the latter analysis without normalization (Figure 8b) 
these peaks were more extended, relatively lower, and 
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Figure 5 Phylogeny from Bayesian analysis of all combined gene fragments, but excluding CAD. Support values next to the nodes are 
Bayesian posterior probabilities. 



slower evolving positions surpass faster ones at deeper 
time-scales (for example, CAD 1 st codon position shows 
higher PI at 95 Ma than 3 rd codon positions of all mito- 
chondrial protein-coding genes). Third codon position 
of CAD sticks out as the most informative partition in 
the dataset. This is partly because this gene fragment is 
three times longer than the other genes. However average- 
modified Townsends profile displays a PI plot where 
length differences have been taken into account for com- 
parison across genes and codon positions (Figure 8c). This 
probably gives the fairest view of comparing partitions and 
it clearly shows that, 1) most information is in the 3 rd 
codon positions and 2) nuclear 3 rd codon positions are su- 
perior to mitochondrial per aligned base at ages older than 
25 Ma. CAD reached the highest PI peak at about 35 Ma, 
but was surpassed by Wnt at deeper time-scales. 



Gene saturation 

Saturation plots for each gene and codon position 
(Figure 9) corresponded well with the Pi-modified 
(Figure 8b-c) profile plots. The 3 rd codon position of 
mitochondrial protein-coding genes showed a high level 
of saturation (Figure 9). In the PI plots (Figure 8a) this 
can be identified by curves with narrow peaks followed 
by steep declines. The 3 rd codon positions of nuclear 
protein-coding genes were slightly saturated (CAD, H3) 
or not saturated at all (Wnt) (Figure 9). No saturation 
was detected in the 1 st and 2 nd codons in either mito- 
chondrial or nuclear protein-coding genes, suggesting 
conservative phylogenetic information in these parti- 
tions (Figure 9). In the PI plots this is revealed as low 
but increasing values backwards in time and without 
any clear peaks or declines. 
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Figure 6 Timetree of Aciliini. Time-calibrated Bayesian phylogeny obtained with BEAST based on the fossil Acilius florissontensis used as a 
crown-node constraint on the genus Acilius. Horizontal bars at the nodes indicate 95% highest posterior density intervals for the age of the 
corresponding node. Maps (a-c) at scale bar represent the position of continents at the time (modified from [61]), colored following hypothetical 
ancestral distributions of the clades obtained from RASP. Map (a): ancestral distribution of Thermonectus in South America and of Tikoloshanes + 
Aethionectes in Africa. Map (b): ancestral distribution of Graphoderus + Acilius in Euramerica and of Sandracottus + Rhantaticus in Asiamerica 
appears. Map (c): current distribution of genera {Tikoloshanes, Aethionectes and Rhantaticus overlap in Africa). 



Discussion and conclusion 

Phylogeny 

In our analyses, Aciliini was always strongly supported 
as monophyletic to the exclusion of Eretini (all analyses 



and partitioning schemes). This is consistent with mor- 
phological evidence, in particular the aciliine synapo- 
morphy of having both metatibial spurs apically bifid 
[12]. The weakly supported alternative of a paraphyletic 
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Figure 7 Ancestral distribution inferred from Bayesian analysis with RASP. Pie charts represent ancestral distributions as probabilities coded 
as follows: A (Nearctic), B (Neotropical), C (Afrotropical), D (Oriental), E (Australian), F (Palearctic). 
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Figure 8 Phylogenetic Informativeness (PI) plots for all data partitions, (a) Ploriginal, (b) PI modified, (c) Plaverage-modified. These analyses 
differ in the use of a normalization step in Ploriginal, which is removed in Plmodified and Plaverage-modified. The latter differ from Plmodified in 
averaging PI values over the gene length. The X-axis denotes time from 100 Ma to present, while the Y-axis denotes Phylogentic Informativeness. 
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Figure 9 Saturation plots of genes and codon positions. Uncorrected p-distances are shown on the y-axis. X-axis represents the pairwise 
distances as inferred on the tree recovered from the single-gene analysis. 



Aciliini with Eretini nested within [14] can therefore be 
rejected with the present dataset Eretini is resolved as 
the sister clade of Aciliini, also corroborating Miller [12]. 
This clade (Aciliini + Eretini) is supported by the syn- 
apomorphy of having the posterodorsal series of setae 
on metatibia arranged in a strongly oblique linear series 
with bases of setae contiguous [12]. The distinctive 
shape of the larvae is also a likely synapomorphy for 
Aciliini + Eretini and the two pairs of anterior camera- 



like stemmata could be a further synapomorphy but 
needs to be examined for more taxa [16]. The mono- 
typic Eretini was also strongly supported as mono- 
phyletic but this has never been questioned since the 
group is well characterized by several distinctive syn- 
apomorphies (see [12,15]). 

We recovered a fully resolved backbone and, except in 
one case, strongly supported relationships between all 
genera within Aciliini. Three alternative hypotheses had 
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been previously identified, i) sister group relationship 
between Acilius and Thermonectus [9,12,13] ii) Acilius 
sister to all other Aciliini [7,11], and iii) Acilius and Gra- 
phoderus as sister taxa [14]. Our results unequivocally 
support the third hypothesis, i.e. that Acilius and Gra- 
phoderus are sister taxa forming a derived Holarctic 
clade, corroborating the analyses of Ribera et al. [14]. 
The Thermonectus + Acilius hypothesis could be rejected 
and we found instead the New World Thermonectus to 
be the sister to all other Aciliini. This is a new hypothesis 
that prompts the investigation of potential morphological 
synapomorphies for Aciliini excluding the Thermonectus 
clade. A labial palp with 4-6 spines in the larvae could 
potentially be a synapomorphy but since Eretes has four 
spines it is more likely that the bispined labial palp is a 
synapomorphy for Thermonectus [57]. Since Thermonec- 
tus is the most species-rich genus its sister-group place- 
ment is in fact supported by the quite common pattern 
that clade diversity and clade ages are positively corre- 
lated. It is worth noting that excluding CAD from the 
dataset resulted in a clade in which Thermonectus is sister 
to Oriental- Australian genera Rhantaticus + Sandracottus. 
The sister group relationship between Rhantaticus and 
Sandracottus was found in all our analyses as well as in 
those by Ribera et al. [14]. These are also supported by the 
synapomorphy of reduced metacoxal lines on the meta- 
coxal process [5,11]. 

The genus Tikoloshanes, includes the single enigmatic 
and rarely collected species T eretiformis, which is 
superficially similar to Eretini [5]. Its placement was 
previously untested. Our conclusion of a monophyletic 
Aciliini already precludes a close relationship of Tiko- 
loshanes with either Eretini or Hydaticini, with which 
the genus bears some similarity based on the arrange- 
ment of the metatibial series of spines, as noted by 
Omer-Cooper [5]. Despite the mixed characters dis- 
played in Tikoloshanes, Omer-Cooper [5] came to the 
conclusion that Tikoloshanes belong to Aciliini with an 
affinity to the other Afrotropical aciliine genus Aethio- 
nectes. Our results corroborate this hypothesis, even 
though it was the only intergeneric node not supported 
with a posterior probability greater than 0.95 in the 
four-partition analysis (0.98 in the six partition analysis). 
Again, the placement changed when CAD was removed 
resulted in Tikoloshanes nested within the Holarctic 
clade, a very unlikely conclusion. 

The effect of removing the 2000 bp CAD fragment 
should be seen in the light of the data partitions that 
come to dominate the phylogenetic signal in the new 
dataset (c.f. Figure 3b). The remaining subset becomes 
dominated by the mitochondrial genes, since the other 
nuclear genes are relatively shorter. Most of the phylo- 
genetic signal from the mitochondrial genes is provided 
by the clearly saturated third codon positions (Figure 9). 



Placement of a single long branched taxon like Tiko- 
loshanes is, likely to be more uncertain when based on a 
higher proportion of saturated data. 

Biogeography and divergence time 

The biogeographical distribution together with the 
phylogeny of Aciliini and Eretini firmly places the origin 
of Aciliini in the Southern Hemisphere. The basalmost 
clades are all optimized to Australia (Eretes), South 
America (Thermonectus) and Africa (Aethionectes + Tiko- 
loshanes). The present distribution could be explained by 
a vicar iance event following the breakup of Gondwana. 
West Gondwana (South America and Africa) and East 
Gondwana (India, Madagascar, Australia) was the first 
major break-up event at about 165-130 Ma [58,59]. The 
split between ancestral Eretini (optimized to Australian re- 
gion) and ancestral Aciliini (optimized to S. America and 
Africa) would follow such a pattern and the age using the 
fossil calibration on the Acilius crown-node was estimated 
to an interval encompassing that geological age: 95% 
HPD = 138-88 Ma. Likewise, an early lineage split within 
Aciliini would have followed the second major break-up in 
West Gondwana between South America (leading to the 
divergence of Thermonectus in the Neotropical region) 
and Africa when the South Atlantic ocean started to open 
at about 135 Ma with land connections likely until around 
105 Ma [59]. With the same calibration point this split in 
the phylogeny was estimated to an interval allowing for 
a geological vicariance event explanation: (114-74 Ma). 
At some point the ancestor of the Australasian (Sandra- 
cottus + Rhantaticus) and Holarctic (Acilius + Grapho- 
derus) clades dispersed out-of- Africa, and this happened 
long before the Tethys sea actually closed and a land 
connection was established [60]. Most Aciliini being 
capable of flight and e.g. Rhantaticus having colonized 
islands like Mauritius and New Caledonia in recent 
times, this seems reasonable. The out-of-Africa node 
was dated to 100-65 Ma and during this time Laurasia 
was divided into an Euramerican and an Asiamerican 
continent, with a number of smaller continental frag- 
ments between these and Africa [61]. The split between 
the ancestor of Holarctic Graphoderus + Acilius, opti- 
mized to have a Nearctic origin, and the ancestor of 
Australasian Rhantaticus + Sandracottus, most probably 
with a Palearctic/Oriental origin, were dated to 59- 
90 Ma. The Turgai Strait separated eastern Palearctic 
from Euramerica during this period [61] and the two 
clades seem to have their origin on respective sides. 

However, estimations using the fossil as an Acilius 
stem-node calibration dated the relevant nodes of the 
tree to be too young for some of these events. The 
Aciliini-Eretini split is too young to be explained by the 
vicariance event of western and eastern Gondwana 
breaking up (83-57 Ma). The split between Neotropical 
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Thermonectus and the rest of Aciliini is too young for 
the vicariant breakup of South America and Africa 
(70-49 Ma). 

Compared to the crown-group calibration this is a less 
parsimonious solution, as it requires more transoceanic 
dispersals events to be invoked. That said, we agree with 
de Quieroz [62] that we have previously underestimated 
how geological time scales can make the most improb- 
able dispersal event probable. For increased precision 
around the evolutionary history of Aciliini, future studies 
are needed which includes multiple primary calibration 
points and takes the full uncertainty into account through 
the priors. Since fossils in this group are scarce this will 
probably have to be done at higher subfamily, family or 
superfamily level. 

Phylogenetic informativeness in CAD 

In recent years CAD has become commonly used for re- 
solving both shallow and deep evolutionary relationships 
of insects [19,27,63]. Evaluation of the performance of 
different gene fragments is often done by comparing 
single gene tree reconstructions with a total evidence 
hypothesis or a hypothesis derived from accumulated 
previous work [27]. But the relative phylogenetic infor- 
mativeness of genes can also be more directly quantified 
over time in relation to its evolutionary rate [53,54,64]. 
While the extrapolation of Townsend s original formula- 
tion of the method [39] beyond the four-taxon case has 
been questioned [64], modifications have been proposed 
that seem to more accurately capture the effect of sa- 
turation [19]. Our analyses suggested variable phyloge- 
netic informativeness for gene-and codon partitions 
over the time-scale in this study. CAD contributed sub- 
stantially to the resolved and highly supported back- 
bone of the phylogenetic tree as evident both from PI 
plots (Figure 8ab) and from comparing the single gene 
tree with the total evidence analysis. Analyzed alone 
CAD recovered both tribes and all genera as monophy- 
letic and resolves the deeper nodes in agreement with a 
Gondwana breakup influence. This confirms previous 
studies in Diptera [17] and Hymenoptera [56] where like- 
wise CAD alone successfully resolved Mesozoic-aged di- 
vergences in agreement with independent evidence. An 
evaluation of nuclear genes in Coleoptera [27] also scored 
CAD as the best performing gene overall among the ones 
compared, but CAD failed to recover some well-supported 
Mesozoic-aged clades. This might have had more to do 
with a poor taxon sampling for this scale and the known 
effects of taxon sampling [65,66]. For the reconstruction of 
the Trichoptera tree of life, CAD was surpassed in phylo- 
genetic Informativeness by RNA Polymerase II (POL) at 
deeper time scales [19]. 

At 2008 bp, CAD was the longest gene in our dataset, 
more than three times longer than mitochondrial genes 



and almost five times longer than other nuclear genes in 
our dataset. As a consequence, CAD was our most in- 
formative gene fragment. To compare its informative- 
ness per site with other genes, we averaged the modified 
Townsend s PI profile [19] by dividing PI values with the 
gene length for each gene and codon position [54]. The 
Average-modified PI profile indicated that the 3 rd codon 
position of CAD was the most informative between 15 
and 50 Ma closely followed and surpassed by the 3 rd 
codon position of Wnt over deeper time. This mirrors 
the comparison with POL [19], which begs for a future 
comparison between Wnt and POL over deeper Meso- 
zoic timescales. Moreover, the Average-modified PI plot 
identified 3 rd codon positions of nuclear genes in general 
as more informative already for relationships older than 
15 Ma compared to 3 rd codon positions in mitochon- 
drial genes. 

It has been reported that CAD 3 rd codon position 
demonstrates heterogeneity in base composition poten- 
tially indicating saturation which can mislead phylogen- 
etic analyses [19,37,67]. In our analysis, 3 rd codon 
position of CAD (as well as the 3 rd codon positions in 
Histone 3) was slightly saturated but far less than in 
mitochondrial genes. Wnt did not show any signs of 
saturation, even at 3 rd codon positions, showing great 
promise as a marker for deeper relationships but is per- 
haps underused today due to shortage of primers and 
protocols to amplify longer fragments. The Wnt gene is 
also known to produce paralogs for some insect groups 
[68] and can present alignment problems [27]. The 
466 bp target fragment in this study showed no evidence 
of paralogs, insertions or deletions and was unproblem- 
atic to align. The CAD alignment in contrast, displayed 
a markedly highly variable section located at the 5' end, 
including a 3-9 bp long indel, but which translated 
correctly to amino acids. This variable section, part of 
the small chain of CPS [17,37], provides CAD with 
phylogenetic information also at younger levels e.g. 
within genera. 
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